Taking the Toys Away

The negative approach to agentic coding


Originally published on eighttrigrams.substack.com on November 22nd, 2025

Image: Newton I-90 Exit 17, Guardrail Repair 2, May 18, 2014 (Wikimedia Commons)

One of the bigger things I’ve built around LLMs has been a BDD-style testing framework. Since I’ve been interested in testing topics forever, this application of the technology seemed an obvious one.

To recap quickly: when I say BDD-style framework, I mean something like Cucumber, where the vision was to formulate user stories, including acceptance criteria, in plain but somewhat rigorous English, and have these texts map to tests, in order to bridge the gap between product managers, development, and QA.

This approach required a tedious markup step, for which additional glue code had to be written. For example, when a “user clicks here” and sees the price as “114.00”, a developer had to translate that into function calls that send that data somewhere or fetch it from somewhere. I have never used this approach for real, but I can easily imagine why the whole topic never fully caught on: it is just too laborious to keep it all maintained.
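
To make the tedium concrete, here is a minimal sketch, in Python rather than one of Cucumber’s usual host languages, of what such glue code boils down to: every plain-English step needs a hand-written mapping to a function. All names, step texts, and the toy `app` dictionary are invented for illustration.

```python
import re

# Registry of (regex, handler) pairs: the hand-written "glue" between
# plain-English steps and executable code.
STEPS = []

def step(pattern):
    """Register a handler for steps matching the given regex."""
    def register(fn):
        STEPS.append((re.compile(pattern), fn))
        return fn
    return register

@step(r'the user clicks "(?P<target>[^"]+)"')
def click(app, target):
    app["clicked"] = target          # stand-in for a real browser click

@step(r'the price shown is "(?P<price>[^"]+)"')
def check_price(app, price):
    assert app["price"] == price     # stand-in for reading the UI

def run_step(app, line):
    """Find the matching step definition and invoke it."""
    for pattern, fn in STEPS:
        m = pattern.search(line)
        if m:
            return fn(app, **m.groupdict())
    raise LookupError(f"no glue code for step: {line!r}")

app = {"price": "114.00"}
run_step(app, 'the user clicks "Checkout"')
run_step(app, 'the price shown is "114.00"')
```

Every new phrasing in the stories, every renamed UI element, means touching this mapping layer again, which is exactly the maintenance burden in question.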

LLMs to the rescue, I thought: they could provide a shield against all sorts of changes, from spelling mistakes and mis-mappings, to changes in the CSS selectors that UI test elements refer to, to changes in application behaviour that are not the subject of the test at hand but would still influence its outcome under normal circumstances, which normally means rework. So what I started with was a high-level description, the user story, plus some testnotes which explain some app behaviour to the agent in more detail (yes, you might rightfully call it gluecode, no worries). Give the agent access to a browser, and let it run off!
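
To give a feel for it, a purely hypothetical input for such a framework, a user story plus testnotes, might look something like this. The wording and structure here are invented for illustration, not taken from my actual framework.

```text
Story: Checkout shows the correct price
  As a shopper, when I add one item priced 114.00 and open the
  checkout page, I expect the total to read "114.00".

Testnotes:
  - The shop runs at http://localhost:3000; log in as the demo user first.
  - The total sits in the order summary box; its exact CSS selector
    changes between releases, so locate it by its label, not the selector.
```

The point is that the agent, not a developer, bridges the gap from this text to concrete browser actions, so selector churn and harmless behaviour changes no longer force rework.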

To make it all work, of course, a lot of challenges had to be overcome, as always. The devil, they say, is in the details (and they are right). Still, I found trying to make it work quite an interesting experience, because it was all about making Claude, which I used as my agent, do exactly what I needed it to do while leaving it a little bit of wiggle-room (after all, the whole idea was about some wiggle-room). This was pretty much the opposite of what was happening at the time in vibecoding, where you literally don’t care much about details, only outcomes. More generally, with AI-assisted coding (vibecoding’s more respectable sibling, to be used in higher-stakes settings) you want a lot more control, but still want to profit from the agent’s creativity.

There is a lot of talk about how models can be made safe, about how to keep them from doing anything dangerous, and so on, but my reading of current thinking is that nothing really substitutes for setting up an environment in which an agent can only do certain things, which is to say, putting hard constraints in place: the agent can only access certain directories, can only query certain websites, and so on. Depending on what you do, you may have varying requirements and tolerance. Steve Yegge, for example, famously boasted about giving his agents * permissions. Fair enough for vibecoding. But he also says that engineering is about making unreliable things reliable, so it really depends on a concrete risk-benefit assessment.

In any case, what I’ve learned from my test experiment is that the way to make the agent do what you want is to revoke options, by putting constraints and guardrails in place at every corner. The agent should only use the browser, and maybe patch some data. It shouldn’t touch files, shouldn’t modify them, should not explore “alternative approaches.” Everything was about limiting the agent’s options and its ways of doing stuff.

Concretely, for working with Claude, this means making heavy use of the permissions system. Beyond the tests, I’ve more recently started experimenting with containerisation, since containers give you an environment that allows for effective control. What you can do here is, in addition to constraining filesystem access and network traffic, route every Bash command the agent attempts to run through the container.
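
For illustration, a permission set in this spirit might look roughly like this in a Claude Code settings file. Treat it as a sketch: the exact rule syntax and available tool names depend on the Claude Code version, and the concrete patterns below are assumptions for the example, not my actual configuration.

```json
{
  "permissions": {
    "allow": [
      "Bash(npx playwright test:*)",
      "Read(./tests/**)"
    ],
    "deny": [
      "Edit",
      "Write",
      "Bash(rm:*)"
    ]
  }
}
```

The shape is the important part: a short allow-list of what the agent may do, and explicit denials for everything it has no business touching.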

Anyways, in life, as in art, limitless options are a source of instability. In art it is well known, for example, that artists work better within given constraints than without them; without boundaries, they go nuts. As so often, things boil down to simple principles. Like ‘propose and reject’, which showed up as Generative Adversarial Networks in machine learning. In our case it is also the pattern of flipping things on their heads, as has been done with ‘inversion of control’, or in subtractive synthesis: you generate, and then you subtract.

To sum up: “taking the toys away” was the one simple trick I came away with that made all the difference.
